@Soul-Code Soul-Code commented Jan 26, 2026

Add Somark tool plugin for converting documents (PDFs, images, etc.) into structured Markdown or JSON format using the Somark API.

Features:

  • Document extraction with OXR (Optical Everything Recognition) algorithm
  • Support for multiple file formats (PDF, PNG, JPG, etc.)
  • Configurable API endpoint and authentication
  • Max file size: 50MB/50 pages
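As a rough illustration of the limits listed above, a pre-flight check on an input file might look like the sketch below. The function name, the extension set, and the reading of "50MB" as 50 × 1024 × 1024 bytes are assumptions made for this example; they are not taken from the plugin's code.

```python
from pathlib import Path
from typing import List, Optional

# Limits come from the PR description. Interpreting "50MB" as 50 * 1024 * 1024
# bytes and the exact extension set are assumptions for this sketch.
MAX_FILE_BYTES = 50 * 1024 * 1024
MAX_PAGES = 50
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def check_upload(filename: str, size_bytes: int, page_count: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 50 MB")
    # Page count is optional because it is not known for every input type.
    if page_count is not None and page_count > MAX_PAGES:
        problems.append("document exceeds 50 pages")
    return problems
```

Returning a list of problems instead of raising on the first one lets the tool report every violation to the user in a single message.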

Related Issues or Context

This PR contains Changes to Non-Plugin

  • Documentation
  • Other

This PR contains Changes to Non-LLM Models Plugin

  • I have Run Comprehensive Tests Relevant to My Changes

This PR contains Changes to LLM Models Plugin

  • My Changes Affect Message Flow Handling (System Messages and User→Assistant Turn-Taking)
  • My Changes Affect Tool Interaction Flow (Multi-Round Usage and Output Handling, for both Agent App and Agent Node)
  • My Changes Affect Multimodal Input Handling (Images, PDFs, Audio, Video, etc.)
  • My Changes Affect Multimodal Output Generation (Images, Audio, Video, etc.)
  • My Changes Affect Structured Output Format (JSON, XML, etc.)
  • My Changes Affect Token Consumption Metrics
  • My Changes Affect Other LLM Functionalities (Reasoning Process, Grounding, Prompt Caching, etc.)
  • Other Changes (Add New Models, Fix Model Parameters etc.)

Version Control (Any Changes to the Plugin Will Require Bumping the Version)

  • I have Bumped Up the Version in Manifest.yaml (Top-Level Version Field, Not in Meta Section)

Dify Plugin SDK Version

  • I have Ensured dify_plugin>=0.3.0,<0.6.0 is in requirements.txt (SDK docs)

Environment Verification (If Any Code Changes)

Local Deployment Environment

  • Dify Version is: , I have Tested My Changes on Local Deployment Dify with a Clean Environment That Matches the Production Configuration.

SaaS Environment

  • I have Tested My Changes on cloud.dify.ai with a Clean Environment That Matches the Production Configuration

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jan 26, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @Soul-Code, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates the Somark DocAI platform as a new plugin within Dify, significantly enhancing its document processing capabilities. Users can now convert diverse document types, such as PDFs and images, into structured Markdown or JSON outputs. This integration provides advanced document understanding and data extraction, facilitating the incorporation of document content into LLM training, RAG systems, and intelligent agents.

Highlights

  • New Somark Plugin Integration: This pull request introduces a new Dify plugin for Somark, a DocAI platform designed for advanced document processing.
  • Document AI Capabilities: The plugin enables the conversion of various document types, including PDFs and images, into structured Markdown or JSON formats. It leverages Somark's proprietary 'OXR' (Optical Everything Recognition) algorithm for precise content extraction.
  • Seamless Workflow Integration: A new 'Extract Document' tool is provided, allowing users to easily integrate Somark's document parsing capabilities into their Dify workflows. Configuration requires a Somark API Key for authentication.


@dosubot dosubot bot added the enhancement New feature or request label Jan 26, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds a new somark plugin, which is a great addition. The code is well-structured, but I've found a few issues that should be addressed before merging. These include a critical bug in API URL construction, missing credential validation, potential runtime errors, and several inconsistencies in metadata and documentation. Addressing these points will improve the plugin's robustness and user experience.
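The review does not show the URL construction bug itself, so the following is only a sketch of one common way such a bug arises with a configurable API endpoint: `urllib.parse.urljoin` replaces the last path segment of a base URL that has no trailing slash. The helper name `join_api_url` and the example host are hypothetical.

```python
from urllib.parse import urljoin


def join_api_url(base_url: str, path: str) -> str:
    """Join a base URL and an endpoint path, keeping every base path segment."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical base URL; the real Somark endpoint is not shown in this thread.
BASE = "https://api.example.invalid/v1"

# urljoin treats "v1" as a file name and replaces it with "extract".
naive = urljoin(BASE, "extract")

# Normalizing the slashes by hand keeps the "/v1" prefix.
safe = join_api_url(BASE, "/extract")

print(naive)
print(safe)
```

Normalizing slashes on both sides also makes the result independent of whether the user typed a trailing slash in the endpoint setting.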

Member

@crazywoola crazywoola left a comment

See comments.

Add Somark tool plugin for converting documents (PDFs, images, etc.) into structured Markdown or JSON format using the Somark API.

Features:

- Document extraction with OXR (Optical Everything Recognition) algorithm
- Support for multiple file formats (PDF, PNG, JPG, etc.)
- Configurable API endpoint and authentication
- Max file size: 50MB/50 pages

1. Improve error handling and type hinting in extract tool
2. Add credential validation in provider
3. Ensure icon resource exists
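
The credential validation mentioned in item 2 is not shown in this conversation. A minimal sketch of what such a check could look like follows; the credential keys `api_key` and `base_url` and the local `CredentialValidationError` class are assumptions for illustration, and a real provider would raise the SDK's own validation error instead.

```python
from urllib.parse import urlparse


class CredentialValidationError(Exception):
    """Hypothetical stand-in for the plugin SDK's credential validation error."""


def validate_credentials(credentials: dict) -> None:
    """Raise CredentialValidationError if the credentials are unusable."""
    # "api_key" and "base_url" are assumed key names, not confirmed by the PR.
    api_key = credentials.get("api_key")
    if not isinstance(api_key, str) or not api_key.strip():
        raise CredentialValidationError("Somark API key is required.")

    base_url = credentials.get("base_url")
    if base_url is not None:
        parsed = urlparse(str(base_url))
        if parsed.scheme not in {"http", "https"} or not parsed.netloc:
            raise CredentialValidationError("API endpoint must be an http(s) URL.")
```

This only checks the shape of the values locally. Confirming that the key is actually accepted would require a request to the service, which is left out here.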
@Soul-Code Soul-Code force-pushed the feat:somark-plugins branch from c46703a to ca66061 Compare January 30, 2026 06:02
@Soul-Code Soul-Code deployed to tools/somark January 30, 2026 06:02 — with GitHub Actions Active
@Soul-Code
Author

Hi @crazywoola,

Thanks for the review! I've addressed all your comments. Please take another look when you have a chance.
